Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes
Authors
Abstract
This paper introduces a novel spectral framework for solving Markov decision processes (MDPs) by jointly learning representations and optimal policies. The major components of the framework are: (i) a general scheme for constructing representations or basis functions by diagonalizing symmetric diffusion operators; (ii) a specific instantiation of this approach in which global basis functions called proto-value functions (PVFs) are formed from the eigenvectors of the graph Laplacian on an undirected graph built from state transitions induced by the MDP; (iii) a three-phased procedure called representation policy iteration (RPI), comprising a sample collection phase, a representation learning phase that constructs basis functions from samples, and a final parameter estimation phase that determines an (approximately) optimal policy within the (linear) subspace spanned by the (current) basis functions; (iv) a specific instantiation of the RPI framework using least-squares policy iteration (LSPI) as the parameter estimation method; (v) several strategies for scaling the proposed approach to large discrete and continuous state spaces, including the Nyström extension for out-of-sample interpolation of eigenfunctions and the use of Kronecker sum factorization to construct compact eigenfunctions in product spaces such as factored MDPs; and (vi) a series of illustrative discrete and continuous control tasks that both illustrate the concepts and provide a benchmark for evaluating the proposed approach. Many challenges remain in scaling the proposed framework to large MDPs, and several elaborations of the framework are briefly summarized at the end.
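As a concrete illustration of component (ii), the minimal sketch below constructs proto-value functions for a small grid-world MDP: build the undirected adjacency graph of the state space, form the combinatorial graph Laplacian L = D - W, and keep its smoothest eigenvectors as basis functions. The grid size, 4-neighbor connectivity, and number of basis functions (k = 10) are illustrative assumptions, not values prescribed by the paper.

```python
# Sketch: proto-value functions (PVFs) for a small grid-world.
# Assumptions (illustrative, not from the paper): a 10x10 grid with
# 4-connected moves, the combinatorial Laplacian L = D - W, k = 10 PVFs.
import numpy as np

def grid_adjacency(n):
    """Adjacency matrix W of an n x n grid with 4-connected neighbors."""
    N = n * n
    W = np.zeros((N, N))
    for r in range(n):
        for c in range(n):
            s = r * n + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    W[s, rr * n + cc] = 1.0
    return W

def proto_value_functions(W, k):
    """Return the k eigenvectors of L = D - W with smallest eigenvalues.

    These smooth, global eigenvectors are the proto-value functions;
    value functions are then approximated in their linear span.
    """
    D = np.diag(W.sum(axis=1))
    L = D - W                            # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L) # L is symmetric, so eigh applies
    return eigvecs[:, :k]                # columns = basis functions per state

W = grid_adjacency(10)
Phi = proto_value_functions(W, k=10)     # 100 states x 10 basis functions
print(Phi.shape)
```

In the RPI loop described above, the final parameter estimation phase (e.g. LSPI) would then fit a weight vector w over sampled transitions so that the value function is approximated as V ≈ Φw in the subspace spanned by these columns.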
Similar Resources
Learning Representation and Control in Continuous Markov Decision Processes
This paper presents a novel framework for simultaneously learning representation and control in continuous Markov decision processes. Our approach builds on the framework of proto-value functions, in which the underlying representation or basis functions are automatically derived from a spectral analysis of the state space manifold. The proto-value functions correspond to the eigenfunctions of ...
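In continuous state spaces such as these, the eigenfunctions are only computed at sampled states, and the abstract above names the Nyström extension as the out-of-sample interpolation method. The sketch below is a minimal, hedged illustration of that idea under assumptions of my own (a Gaussian similarity kernel and hypothetical variable names), extending a kernel eigenvector to an unsampled state, rather than the paper's exact recipe.

```python
# Sketch: Nystrom out-of-sample extension of a kernel eigenvector.
# Assumption (illustrative): an eigenvector phi with eigenvalue lam of the
# kernel matrix K on sampled states extends to a new state x via
#   phi(x) ~= (1 / lam) * sum_i k(x, x_i) * phi_i.
import numpy as np

def gaussian_kernel(a, b, sigma=0.5):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # sampled continuous states

# Kernel matrix on the samples and its leading eigenpair.
K = np.array([[gaussian_kernel(a, b) for b in X] for a in X])
lams, vecs = np.linalg.eigh(K)             # ascending eigenvalues
lam, phi = lams[-1], vecs[:, -1]

def nystrom_extend(x):
    """Approximate the eigenfunction value at an unsampled state x."""
    kx = np.array([gaussian_kernel(x, xi) for xi in X])
    return kx @ phi / lam

print(nystrom_extend(np.array([0.1, -0.3])))
```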
Proto-transfer Learning in Markov Decision Processes Using Spectral Methods
In this paper we introduce proto-transfer learning, a new framework for transfer learning. We explore solutions to transfer learning within reinforcement learning through the use of spectral methods. Proto-value functions (PVFs) are basis functions computed from a spectral analysis of random walks on the state space graph. They naturally lead to the ability to transfer knowledge and representati...
Learning Representation and Control in Markov Decision Processes: New Frontiers
This paper describes a novel machine learning framework for solving sequential decision problems called Markov decision processes (MDPs) by iteratively computing low-dimensional representations and approximately optimal policies. A unified mathematical framework for learning representation and optimal control in MDPs is presented based on a class of singular operators called Laplacians, whose m...
Journal: Journal of Machine Learning Research
Volume: 8, Issue: -
Pages: -
Publication date: 2007